FIGURE 1A

Nucleotide Sequence of Human ABCG4 Transporter Gene 
Sequence Range: 1 to 3455 

GCCACCATGG CGGAGAAGGC GCTGGAGGCC GTGGGCTGTG GACTAGGGCC GGGGGCTGTG 60 
GCCATGGCCG TGACGCTGGA GGACGGGGCG GAACCCCCTG TGCTGACCAC GCACCTGAAG 120 
AAGGTGGAGA ACCACATCAC TGAAGCCCAG CGCTTCTCCC ACCTGCCCAA GCGCTCAGCC 180 
GTGGACATCG AGTTCGTGGA GCTGTCCTAT TCCGTGCGGG AGGGGCCCTG CTGGCGCAAA 240 
AGGGGTTATA AGACCCTTCT CAAGTGCCTC TCAGGTAAAT TCTGCCGCCG GGAGCTGATT 300 
GGCATCATGG GCCCCTCAGG GGCTGGCAAG TCTACATTCA TGAACATCTT GGCAGGATAC 360 
AGGGAGTCTG GAATGAAGGG GCAGATCCTG GTTAATGGAA GGCCACGGGA GCTGAGGACC 420 
TTCCGCAAGA TGTCCTGCTA CATCATGCAA GATGACATGC TGCTGCCGCA CCTCACGGTG 480
TTGGAAGCCA TGATGGTCTC TGCTAACCTG AATCTTACTG AGAATCCCGA TGTGAAAAAC 540 
GATCTCGTGA CAGAGATCCT GACGGCACTG GGCCTGATGT CGTGCTCCCA CACGAGGACA 600 
GCCCTGCTCT CTGGCGGGCA GAGGAAGCGT CTGGCCATCG CCCTGGAGCT GGTCAACAAC 660 
CCGCCTGTCA TGTTCTTTGA TGAGCCCACC AGTGGTCTGG ATAGCGCCTC TTGTTTCCAA 720 
GTGGTGTCCC TCATGAAGTC CCTGGCACAG GGGGGCCGTA CCATCATCTG CACCATCCAC 780 
CAGCCCAGTG CCAAGCTCTT TGAGATGTTT GACAAGCTCT ACATCCTGAG CCAGGGTCAG 840 
TGCATCTTCA AAGGAGTGGT CACCAACCTG ATCCCCTATC TAAAGGGACT CGGCTTGCAT 900 
TGCCCCACCT ACCACAACCC GGCTGACTTC ATCATCGAGG TGGCCTCTGG CGAGTATGGA 960 
GACCTGAACC CCATGTTGTT CAGGGCTGTG CAGAATGGGC TGTGCGCTAT GGCTGAGAAG 1020 
AAGAGCAGCC CTGAGAAGAA CGAGGTCCCT GCCCCATGCC CTCCTTGTCC TCCGGAAGTG 1080 
GATCCCATTG AAAGCCACAC CTTTGCCACC AGCACCCTCA CACAGTTCTG CATCCTCTTC 1140 
AAGAGGACCT TCCTGTCCAT CCTCAGGGAC ACGGTCCTGA CCCACCTACG GTTCATGTCC 1200 
CACGTGGTTA TTGGCGTGCT CATCGGCCTC CTCTACCTGC ATATTGGCGA CGATGCCAGC 1260 
AAGGTCTTCA ACAACACCGG CTGCCTCTTC TTCTCCATGC TGTTCCTCAT GTTCGCCGCC 1320 
CTCATGCCAA CTGTGCTCAC CTTCCCCTTA GAGATGGCGG TCTTCATGAG GGAGCACCTC 1380
AACTACTGGT ACAGCCTCAA AGCGTATTAC CTGGCCAAGA CCATGGCTGA CGTGCCCTTT 1440
CAGGTGGTGT GTCCGGTGGT CTACTGCAGC ATTGTGTACT GGATGACGGG CCAGCCCGCT 1500
GAGACCAGCC GCTTCCTGCT CTTCTCAGCC CTGGCCACCG CCACCGCCTT GGTGGCCCAA 1560
TCTTTGGGGC TGCTGATCGG AGCTGCTTCC AACTCCCTAC AGGTGGCCAC TTTTGTGGGC 1620
CCAGTTACCG CCATCCCTGT CCTCTTGTTC TCCGGCTTCT TTGTCAGCTT CAAGACCATC 1680
CCCACTTACC TGCAATGGAG CTCCTATCTC TCCTATGTCA GGTATGGCTT TGAGGGTGTG 1740
ATCCTGACGA TCTATGGCAT GGAGCGAGGA GACCTGACAT GTTTAGAGGA ACGCTGCCCG 1800
TTCCGGGAGC CACAGAGCAT CCTCCGAGCG CTGGATGTGG AGGATGCCAA GCTCTACATG 1860
GACTTCCTGG TCTTGGGCAT CTTCTTCCTA GCCCTGCGGC TGCTGGCCTA CCTTGTGCTG 1920
CGTTACCGGG TCAAGTCAGA GAGATAGAGG CTTGCCCCAG CCTGTACCCC AGCCCCTGCA 1980
GCAGGAAGCC CCCAGTCCCA GCCCTTTGGG ACTGTTTTAA CCTTATAGAC TTGGGCACTG 2040
GTTCCTGGCG GGGCTATCCT CTCCTCCCTT GGCTCCTCCA CAGGCTGGCT GTCGGACTGC 2100
GCTCCCAGCC TGGGCTCTGG GAGTGGGGGC TCCAGCCCTC CCCACTATGC CCAGGAGTCT 2160 
TCCCAAGTTG ATGCGGTTTG TAGCTTCCTC CCTACTCTCT CCAACACCTG CATGCAAAGA 2220 
CTACTGGGAG GCTGCTGCCT CCTTCCTGCC CATGGCACCC TCCTCTGCTG TCTGCCTGGG 2280 
AGCCCTAGGC TCTCTAGGGC CCCACTTACA ACTGACCAAA GTGGCCCCCT CTGGGGGTCC 2340 
CCACCACACA AGTGTTTGTA AACTGGGCTG CTATAAGGTT GGAGTTCCAG GGCTGGGCCC 2400 
TGGTGGAGTC CACTGGAAGT CCCATTATGG ATGTTGAAAT GGACAGGGAA GGACTCTGGA 2460
AGTCTCTTCC TCCTCCTCCT CTTCTCTCCA CCCCTAGACC CTGGCTGACT TGGACAATCT 2520 
GCCAGGACAG AAGCTGGGTT TTCTGTCTAG GTCACCACTC CCAATCCTGG GGATTGGAGA 2580 
GGCCTGGGGC TGTGGGATGC CCCATCCCCC TCCCCATCAC CTTTGGTGGG GGCAGGGCCT 2640 
GGTGGCACCT GTGCAATAAT GTCTGTGTTT CTCTCCCACC TGCCACTGGA ACTGGAGAAT 2700
GCACTTTATT CTGGGCGGGG GGTGAGTGGG GGAAGACCCA ACCCTCCTTT CTCGCTGCCC 2760
CTAACGCATG CACGGTCTCG TGATGCTCCC TCCCTCTCCG GAGTGACAGG CACATACATG 2820 
AGAACAGGCC ATCTCAGCCC TACACACTTG CCATCCCCTA CAGCACAGAG GAAGAGTGAT 2880 
GGTGGCATGC TGGTGGTGGC GGGTGCTGGT GGGAGGACAG TGCCAACCTC CTCCTGGGGA 2940 



FIGURE 1B

TCCCATGTTG GAGACTCTAA GGATAAGGCT GGTGCTGCCC AGGGTGTCTA CAGGAACTGC 3000
AGGTGTCTAC CCCCAAGTCT TCCCTCCTCC CAAGCCAGGG GTGGCACAGG GCACTAGATC 3060
CCTGGAGTTC AGGAACCAAC ACAAGCACAA CCACGGGCAT AAGTTGGCCT TGGCCACTGC 3120
CACCCACGGC CCTCCTTTTG TGCTCCATGC TGGCATCTTC ACTCCCCTAC CCCTTCCCCA 3180
GCCACTGCTG CTCATTCAAA CTTCTGTCCA TGTCCCTCCA CTGTTCCTAT CAGCAGGTGG 3240
CCCCTGGGCA TCAGAACAGC CTGCCCTGGG CACCAGGTGG CAGACACACT CAGAGCATGT 3300
CTGGCTTTCC TGGTGGGTCC AGGCTCATTC TGCTTCTGAT TTCCCCTCCC CCAGGGCTCA 3360
TTTTCCCCCT TTTTCCTGTA CACATCCCTG TCTACCTCCT CTCACCCTGC CACAGATTCT 3420
TCCTATCACA CAGGGATGCC AGTTGTATTT GTGGG 3455
FIGURE 2A

Coding Sequence of Human ABCG4 Transporter Gene
Sequence Range: 1 to 3455

gcc acc atg gcg gag aag gcg ctg gag gcc gtg ggc tgt gga cta ggg ccg ggg gct gtg 60
        Met Ala Glu Lys Ala Leu Glu Ala Val Gly Cys Gly Leu Gly Pro Gly Ala Val

gcc atg gcc gtg acg ctg gag gac ggg gcg gaa ccc cct gtg ctg acc acg cac ctg aag 120
Ala Met Ala Val Thr Leu Glu Asp Gly Ala Glu Pro Pro Val Leu Thr Thr His Leu Lys

aag gtg gag aac cac atc act gaa gcc cag cgc ttc tcc cac ctg ccc aag cgc tca gcc 180
Lys Val Glu Asn His Ile Thr Glu Ala Gln Arg Phe Ser His Leu Pro Lys Arg Ser Ala

gtg gac atc gag ttc gtg gag ctg tcc tat tcc gtg cgg gag ggg ccc tgc tgg cgc aaa 240
Val Asp Ile Glu Phe Val Glu Leu Ser Tyr Ser Val Arg Glu Gly Pro Cys Trp Arg Lys

agg ggt tat aag acc ctt ctc aag tgc ctc tca ggt aaa ttc tgc cgc cgg gag ctg att 300
Arg Gly Tyr Lys Thr Leu Leu Lys Cys Leu Ser Gly Lys Phe Cys Arg Arg Glu Leu Ile

ggc atc atg ggc ccc tca ggg gct ggc aag tct aca ttc atg aac atc ttg gca gga tac 360
Gly Ile Met Gly Pro Ser Gly Ala Gly Lys Ser Thr Phe Met Asn Ile Leu Ala Gly Tyr

agg gag tct gga atg aag ggg cag atc ctg gtt aat gga agg cca cgg gag ctg agg acc 420
Arg Glu Ser Gly Met Lys Gly Gln Ile Leu Val Asn Gly Arg Pro Arg Glu Leu Arg Thr

ttc cgc aag atg tcc tgc tac atc atg caa gat gac atg ctg ctg ccg cac ctc acg gtg 480
Phe Arg Lys Met Ser Cys Tyr Ile Met Gln Asp Asp Met Leu Leu Pro His Leu Thr Val
FIGURE 2B

ttg gaa gcc atg atg gtc tct gct aac ctg aat ctt act gag aat ccc gat gtg aaa aac 540
Leu Glu Ala Met Met Val Ser Ala Asn Leu Asn Leu Thr Glu Asn Pro Asp Val Lys Asn

gat ctc gtg aca gag atc ctg acg gca ctg ggc ctg atg tcg tgc tcc cac acg agg aca 600
Asp Leu Val Thr Glu Ile Leu Thr Ala Leu Gly Leu Met Ser Cys Ser His Thr Arg Thr

gcc ctg ctc tct ggc ggg cag agg aag cgt ctg gcc atc gcc ctg gag ctg gtc aac aac 660
Ala Leu Leu Ser Gly Gly Gln Arg Lys Arg Leu Ala Ile Ala Leu Glu Leu Val Asn Asn

ccg cct gtc atg ttc ttt gat gag ccc acc agt ggt ctg gat agc gcc tct tgt ttc caa 720
Pro Pro Val Met Phe Phe Asp Glu Pro Thr Ser Gly Leu Asp Ser Ala Ser Cys Phe Gln

gtg gtg tcc ctc atg aag tcc ctg gca cag ggg ggc cgt acc atc atc tgc acc atc cac 780
Val Val Ser Leu Met Lys Ser Leu Ala Gln Gly Gly Arg Thr Ile Ile Cys Thr Ile His

cag ccc agt gcc aag ctc ttt gag atg ttt gac aag ctc tac atc ctg agc cag ggt cag 840
Gln Pro Ser Ala Lys Leu Phe Glu Met Phe Asp Lys Leu Tyr Ile Leu Ser Gln Gly Gln

tgc atc ttc aaa gga gtg gtc acc aac ctg atc ccc tat cta aag gga ctc ggc ttg cat 900
Cys Ile Phe Lys Gly Val Val Thr Asn Leu Ile Pro Tyr Leu Lys Gly Leu Gly Leu His

tgc ccc acc tac cac aac ccg gct gac ttc atc atc gag gtg gcc tct ggc gag tat gga 960
Cys Pro Thr Tyr His Asn Pro Ala Asp Phe Ile Ile Glu Val Ala Ser Gly Glu Tyr Gly

gac ctg aac ccc atg ttg ttc agg gct gtg cag aat ggg ctg tgc gct atg gct gag aag 1020
Asp Leu Asn Pro Met Leu Phe Arg Ala Val Gln Asn Gly Leu Cys Ala Met Ala Glu Lys

aag agc agc cct gag aag aac gag gtc cct gcc cca tgc cct cct tgt cct ccg gaa gtg 1080
Lys Ser Ser Pro Glu Lys Asn Glu Val Pro Ala Pro Cys Pro Pro Cys Pro Pro Glu Val
FIGURE 2C

gat ccc att gaa agc cac acc ttt gcc acc agc acc ctc aca cag ttc tgc atc ctc ttc 1140
Asp Pro Ile Glu Ser His Thr Phe Ala Thr Ser Thr Leu Thr Gln Phe Cys Ile Leu Phe

aag agg acc ttc ctg tcc atc ctc agg gac acg gtc ctg acc cac cta cgg ttc atg tcc 1200
Lys Arg Thr Phe Leu Ser Ile Leu Arg Asp Thr Val Leu Thr His Leu Arg Phe Met Ser

cac gtg gtt att ggc gtg ctc atc ggc ctc ctc tac ctg cat att ggc gac gat gcc agc 1260
His Val Val Ile Gly Val Leu Ile Gly Leu Leu Tyr Leu His Ile Gly Asp Asp Ala Ser

aag gtc ttc aac aac acc ggc tgc ctc ttc ttc tcc atg ctg ttc ctc atg ttc gcc gcc 1320
Lys Val Phe Asn Asn Thr Gly Cys Leu Phe Phe Ser Met Leu Phe Leu Met Phe Ala Ala

ctc atg cca act gtg ctc acc ttc ccc tta gag atg gcg gtc ttc atg agg gag cac ctc 1380
Leu Met Pro Thr Val Leu Thr Phe Pro Leu Glu Met Ala Val Phe Met Arg Glu His Leu

aac tac tgg tac agc ctc aaa gcg tat tac ctg gcc aag acc atg gct gac gtg ccc ttt 1440
Asn Tyr Trp Tyr Ser Leu Lys Ala Tyr Tyr Leu Ala Lys Thr Met Ala Asp Val Pro Phe

cag gtg gtg tgt ccg gtg gtc tac tgc agc att gtg tac tgg atg acg ggc cag ccc gct 1500
Gln Val Val Cys Pro Val Val Tyr Cys Ser Ile Val Tyr Trp Met Thr Gly Gln Pro Ala

gag acc agc cgc ttc ctg ctc ttc tca gcc ctg gcc acc gcc acc gcc ttg gtg gcc caa 1560
Glu Thr Ser Arg Phe Leu Leu Phe Ser Ala Leu Ala Thr Ala Thr Ala Leu Val Ala Gln

tct ttg ggg ctg ctg atc gga gct gct tcc aac tcc cta cag gtg gcc act ttt gtg ggc 1620
Ser Leu Gly Leu Leu Ile Gly Ala Ala Ser Asn Ser Leu Gln Val Ala Thr Phe Val Gly

cca gtt acc gcc atc cct gtc ctc ttg ttc tcc ggc ttc ttt gtc agc ttc aag acc atc 1680
FIGURE 2D

Pro Val Thr Ala Ile Pro Val Leu Leu Phe Ser Gly Phe Phe Val Ser Phe Lys Thr Ile

ccc act tac ctg caa tgg agc tcc tat ctc tcc tat gtc agg tat ggc ttt gag ggt gtg 1740
Pro Thr Tyr Leu Gln Trp Ser Ser Tyr Leu Ser Tyr Val Arg Tyr Gly Phe Glu Gly Val

atc ctg acg atc tat ggc atg gag cga gga gac ctg aca tgt tta gag gaa cgc tgc ccg 1800
Ile Leu Thr Ile Tyr Gly Met Glu Arg Gly Asp Leu Thr Cys Leu Glu Glu Arg Cys Pro

ttc cgg gag cca cag agc atc ctc cga gcg ctg gat gtg gag gat gcc aag ctc tac atg 1860
Phe Arg Glu Pro Gln Ser Ile Leu Arg Ala Leu Asp Val Glu Asp Ala Lys Leu Tyr Met

gac ttc ctg gtc ttg ggc atc ttc ttc cta gcc ctg cgg ctg ctg gcc tac ctt gtg ctg 1920
Asp Phe Leu Val Leu Gly Ile Phe Phe Leu Ala Leu Arg Leu Leu Ala Tyr Leu Val Leu

cgt tac cgg gtc aag tca gag aga tag agg ctt gcc cca gcc tgt acc cca gcc cct gca 1980
Arg Tyr Arg Val Lys Ser Glu Arg ***

gca gga agc ccc cag tcc cag ccc ttt ggg act gtt tta acc tta tag act tgg gca ctg 2040
gtt cct ggc ggg gct atc ctc tcc tcc ctt ggc tcc tcc aca ggc tgg ctg tcg gac tgc 2100
gct ccc agc ctg ggc tct ggg agt ggg ggc tcc agc cct ccc cac tat gcc cag gag tct 2160
tcc caa gtt gat gcg gtt tgt agc ttc ctc cct act ctc tcc aac acc tgc atg caa aga 2220
cta ctg gga ggc tgc tgc ctc ctt cct gcc cat ggc acc ctc ctc tgc tgt ctg cct ggg 2280
agc cct agg ctc tct agg gcc cca ctt aca act gac caa agt ggc ccc ctc tgg ggg tcc 2340
cca cca cac aag tgt ttg taa act ggg ctg cta taa ggt tgg agt tcc agg gct ggg ccc 2400
FIGURE 2E

tgg tgg agt cca ctg gaa gtc cca tta tgg atg ttg aaa tgg aca ggg aag gac tct gga 2460
agt ctc ttc ctc ctc ctc ctc ttc tct cca ccc cta gac cct ggc tga ctt gga caa tct 2520
gcc agg aca gaa gct ggg ttt tct gtc tag gtc acc act ccc aat cct ggg gat tgg aga 2580
ggc ctg ggg ctg tgg gat gcc cca tcc ccc tcc cca tca cct ttg gtg ggg gca ggg cct 2640
ggt ggc acc tgt gca ata atg tct gtg ttt ctc tcc cac ctg cca ctg gaa ctg gag aat 2700
gca ctt tat tct ggg cgg ggg gtg agt ggg gga aga ccc aac cct cct ttc tcg ctg ccc 2760
cta acg cat gca cgg tct cgt gat gct ccc tcc ctc tcc gga gtg aca ggc aca tac atg 2820
aga aca ggc cat ctc agc cct aca cac ttg cca tcc cct aca gca cag agg aag agt gat 2880
ggt ggc atg ctg gtg gtg gcg ggt gct ggt ggg agg aca gtg cca acc tcc tcc tgg gga 2940
tcc cat gtt gga gac tct aag gat aag gct ggt gct gcc cag ggt gtc tac agg aac tgc 3000
agg tgt cta ccc cca agt ctt ccc tcc tcc caa gcc agg ggt ggc aca ggg cac tag atc 3060
cct gga gtt cag gaa cca aca caa gca caa cca cgg gca taa gtt ggc ctt ggc cac tgc 3120
cac cca cgg ccc tcc ttt tgt gct cca tgc tgg cat ctt cac tcc cct acc cct tcc cca 3180
gcc act gct gct cat tca aac ttc tgt cca tgt ccc tcc act gtt cct atc agc agg tgg 3240
ccc ctg ggc atc aga aca gcc tgc cct ggg cac cag gtg gca gac aca ctc aga gca tgt 3300
ctg gct ttc ctg gtg ggt cca ggc tca ttc tgc ttc tga ttt ccc ctc ccc cag ggc tca 3360
ttt tcc ccc ttt ttc ctg tac aca tcc ctg tct acc tcc tct cac cct gcc aca gat tct 3420
tcc tat cac aca ggg atg cca gtt gta ttt gtg gg 3455
FIGURE 3

Predicted Protein Sequence of Human ABCG4 Transporter

MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRF 50
SHLPKRSAVDIEFVELSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGI 100
MGPSGAGKSTFMNILAGYRESGMKGQILVNGRPRELRTFRKMSCYIMQDD 150
MLLPHLTVLEAMMVSANLNLTENPDVKNDLVTEILTALGLMSCSHTRTAL 200
LSGGQRKRLAIALELVNNPPVMFFDEPTSGLDSASCFQVVSLMKSLAQGG 250
RTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCP 300
TYHNPADFIIEVASGEYGDLNPMLFRAVQNGLCAMAEKKSSPEKNEVPAP 350
CPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 400
VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEM 450
AVFMREHLNYWYSLKAYYLAKTMADVPFQVVCPVVYCSIVYWMTGQPAET 500
SRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSG 550
FFVSFKTIPTYLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFR 600
EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER 646

Transmembrane domains are underlined.

Walker A   C signature   Walker B
FIGURE 4A

ClustalW Multiple Sequence Alignment of the Members of the ABCG Subfamily

ABCG1    1 MAAFSVGTAMNASSYSAEMTEPKSVCVSVDEVVSSNMEATETDLLNGHLKKVDNNLTEAQRFSSLPRRAAVNIEFRD 77
ABCG4    1 MAEKALEAVGCGLGPGAVAMAVT LEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVE 65
ABCG2    1 MSSSNVEVFIP VSQGNTNGFPATVSN DLKAFTEGAVLSFHNICYR- 45
ABCG5    1 [illegible in source] 50
ABCG8    1 MAGKAAEERGLPKGATPQDTSGLQDRLFS SESDNSLYFTYSGQPNTLEVR -- DLNYQVDLASQVPWFEQLAQ 70



|ABCG1 78 LSYSVPEG^ 

[Hsl ^ 

|ABCG4 66 LSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGIMGBSG^GKSTFMNILAGYRE-SG~MKGQILVNGRPRELRTFRKMS| 

P3l , 

|ABCG2 46 VKLKSGFLPCR-KPVEKEILSNINGIMKPG-LNAIIj^^ 

[121] 

|ABCG5 51 VRPWWDITSCR-QQWTRQILKDVSLYVESGQIMCIL'^^MBLLDAMSGRLGRAGTFLGEVYVNGRALRREQFQDCFl 

|H , 

[ABCG8 71' FKMPWTSPSCQ — NSCELGIQNLSFKVRSGQMLAII'G^$^CGBA,SLLDVITGRGHGGKIKSGQIWINGQPSSPQLVRKCV| 

g ^ . 

|ABCG1 156 CYIMQDDMLLPHLTVQEAMMVSAHLKLQ— EKDEGRREMVKEILTALGLLSCANTRTGS g|||Mi!LAIALELV| 

|228| 



[ABCG4 144 CYIMQDDMLLPHLTVLEAMMVSANLNLT — ENPDVKNDLVTEILTALGLMSCSHTRTAL |ig|5^<$^LAIALELV[ 

[216] 

[ABCG2 122 GYVVQDDVVMGTLTVRENLQFSAALRLATTMTNHEK^^ 

[201] 

[ABCG5 130 SYVLQSDTLLSSLTVRETLHYTALLAIRR-GNPGSFQKKVEAVMAELSLSHVADRLIGNYSLGG;X$mEmEVSIAAQLL| 

[208] 

[ABCG8 149 AHVRQHNQLLPNLTVRETLAFI AQMRLPRTFSQAQRDKRVEDVI AELRLRQCADTRVGNMYVRGLSQGEFtERVS IGVQLL| 

[228] 

I j j ■ .: .» ■ j -1 

[ABCG1 229 NNPPMjgMlSGLDSASCFQVVSLMKGLAQGGRSIICTIHQPSAKLFELFDQLYVLSQGQCVYRGKVCNLVPYLRDLGl 
[3081 

[ABCG4 217 NNPP|j|ipgiSGLDSASCFQVVSLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLG| 



|ABCG2 202 TDPS|ij^l^ifTGLDSSTANAVLLLLKRMSKQGRTIIFSIHQPRYSIFKLFDSLTLLASGRLMFHGPAQSALGYFESAG| 

us , 

|ABCG5 209 QDPKViii^v£gTTGLDCMTANQIVVLLVELARRNRIVVLTIHQPRSELFQLFDKIAILSFGELIFCGTPAEMLDFFNDCG| 

[288] _____ 

[ABCG8 229 WNPGXIyllp^SGLDSFTAHNLVKTLSRLAKGMRLVLISLHQPRSDIFRLFDLVLLMTSGTPIYLGAAQHMVQYFTAIGl 

[3081 "~" ^ ~ 

| : : . . :| 

[ABCG1 309 LNCPTYHNPADFVMEVASGEYGD— QN-SRLVRAVREGMCDSDHKRDLGGDAEVNPFLWHRPSEEVKQTKRLKGLRKDS^ 

S , 

[ABCG4 297 LHCPTYHNPADFI I EVASGEYGD — LN- PMLFRAVQNGLCAMAEK KSSPEKNEVPAPCPPCPPEH 

1357] 

|abcg2 ; 282 yhceaynnpadffldiingdstavalnreedfkateiiepskqdkp-lieklaeiyvnssfyketkaelhqlsggekkkk| 
FIGURE 4B

ABCG5  289 YPCPEHSNPFDFYMDLTSVDTQSKERE-IETSKRVQMIESAYKKS AICHKTLKNIERMKHLKTLP- 352
ABCG8  309 YPCPRYSHPADFYVDLTSIDRRSREQE-LATREKAQSLAALFLEKVR DLDDFLWKAETKDLDEDTCVESSVTPLD 382

ABCG1  385 SSMEGCHSFSASCLTQFCILFKRTFLSIMRDSVLTHLRITSHIGIGLLIGLLYLGIGNETKK--VLSNSGFLFFSMLFLM 462
ABCG4  358 VDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHVVIGVLIGLLYLHIGDDASK--VFNNTGCLFFSMLFLM 435
ABCG2  361 ITVFKEISYTTSFCHQLRWVSKRSFKNLLGNPQASIAQIIVTVVLGLVIGAIYFGLKNDSTG--IQNRAGVLFFLTTNQC 438
ABCG5  353 mvpfktkd-spgvfsklgvllrrvtrnlvruklavitrllqnlimglfllffvlrvrsnvlkgaiqdrvgllyqfvgatp 431
ABCG8  383 TNCLPSPTKMPGAVQQFTTLIRRQISNDFRDLPTLLIHGAEACLMSMTIGFLYFGHGSIQLS--FMDTAALLFMIGALIP 460

ABCG1  463 FAALMPTVLTFPLEMGVFLREHLNYWYSLKAYYLAKTMADVPFQIMFP-VAYCSIVYWMTSQPSDAVRFVLFAALGTMTS 541
ABCG4  436 FAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQVVCP-VVYCSIVYWMTGQPAETSRFLLFSALATATA 514
ABCG2  439 FSSVS-AVELFVVEKKLFIHEYISGYYRVSSYFLGKLLSDLLPMRMLPSIIFTCIVYFMLGLKPKADAFFVI^FTLI^V^ 517
ABCG5  432 YTGMLNAVNLFPVLRAVSDQESQDGLYQKWQMMLAYALHVLPFSVVAT-MIFSSVCYWTLGLHPEVARFGYFSAALLAPH 510
ABCG8  461 FNVILDVISKCYSERAMLYYELEDGLYTTGPYFFAKILGELPEHCAYI-IIYGMPTYWLANLRPGLQPFLLHFLLVWLVV 539

ABCG1  542 LVAQSLGLLIG-AASTSLQVATFVGPVTAIPVLLFSGFFVSFDTIPTYLQWMSYISYVRYGFEGVILSIYG L 612
ABCG4  515 LVAQSLGLLIG-AASNSLQVATFVGPVTAIPVLLFSGFFVSFKTIPTYLQWSSYLSYVRYGFEGVILTIYG M 585
ABCG2  518 YSASSMALAIA-AGQSWSVATLLMTICFVFiyMIFSGLLVNLTTIA$WLSWLQYFSIPRYGFTALQHNEFLGQNFCPGLN 596
ABCG5  511 LIGEFLTLVLLGIVQNPNIVNSVVALLSIAGVLVGSGFLRNIQEMPIPFKIISYFTFQKYCSEILVVNEFYGLN FTC 587
ABCG8  540 FCCRIMALAAA-ALLPTFHMASFFSNALYNSFYLAGGFMINLSSLWTVPAWISKVSFLRWCFEGLMKIQFS R 610

ABCG1  613 DREDLHCDIDETCHFQ-KSEAILRELDVENAKLYLDFIVLGIFFISLRLIAYLVLRYKIRAER 674
ABCG4  586 ERGDLTC-LEERCPFR-EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER 646
ABCG2  597 ATGNNPC-NYATCTG--EEYLVKQGIDLSPWGLWKNHVALACMIVIFLTIAYLKLLFLKKYS 655
ABCG5  588 GSSNVSVTTNPMCAFTQGIQFIEKTCPGATSRFTMNFLILYSFIPALVILGIVVFKIRDHLISR 651
ABCG8  611 RTYKMPLGNLTIAVS GDKILSVMELDSYPLYAIYLIVIGLSGGFMVLYYVSLRFIKQKPSQDW 673
FIGURE 5

ClustalW Multiple Sequence Alignment of Partial Human ABCG4 Transporter in GenBank (AN: CAC17140) and Human ABCG4 Transporter of this Invention

ABCG4 vs. CAC17140



ABCG4      1 MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVELSYSVREGPCWRKRG 80
CAC17140   1                    MAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVDIEFVELSYSVREGPCWRKRG 61



ABCG4     81 YKTLLKCLSGKFCRRELIGIMGPSGAGKSTFMNILAGYRESGMKGQILVNGRPRELRTFRKMSCYIMQDDMLLPHLTVLE 160
CAC17140  62 Y KT L L K C L S G K F C RP. ELI GtEllpii^^S^:^ EMN I LAG Y RE S GM KG Q I L V K G R ? R E L R7 F RKM 3 C Y 1 M Q D DML L P K L T V LFi 141

ABCG4    161 AMMVSANLNLTENPDVKNDLVTEILTALGLMSCSHTRTALLSGGQRKRLAIALELVNNPPVMFFDEPTSGLDSASCFQVV 240
CAC17140 142 AMMVSANLKLSEKQEVKKELVTEILTALGLMSCSHTRTA [remainder illegible in source] 221



ABCG4    241 SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCPTYHNPADFIIEVASGEYGDL 320
CAC17140 222 SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCPTYHNPADFIIEVASGEYGDL 301

ABCG4    321 NPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 400
CAC17140 302 NPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDPIESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHV 381

ABCG4    401 VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 480
CAC17140 382 VIGVLIGLLYLHIGDDASKVFNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 461

ABCG4    481 VCPVVYCSIVYWMTGQPAETSRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSGFFVSFKTIPT 560
CAC17140 462 VCPVVYCSIVYWMTGQPAETSRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPVTAIPVLLFSGFFVSFKTIPT 541

ABCG4    561 YLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFREPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRY 640
CAC17140 542 YLQWSSYLSYVRYGFEGVILTIYGMERGDLTCLEERCPFREPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRY 621

ABCG4    641 RVKSER 646
CAC17140 622 RVKSER 627
Figure 7
Figure 9
Directory: Mapleleaf/target clone/ABCG family/abcg4/g4cdna (Assembly file)



G4-clone nucleotide sequence range 1 to 2687 

taccgagctcggatccactagtccagtgtggtggaattgcccttgccaccatggcggagaaggcgctggagg

CCGTGGGCTGTGGACTAGGGCCGGGGGCTGTGGCCATGGCCGTGACGCTGGAGGACGGGGCGGAACCCCCTG 
TGCTGACCACGCACCTGAAGAAGGTGGAGAACCACATCACTGAAGCCCAGCGCTTCTCCCACCTACCCAAGC 
GCTCAGCCGTGGACATCGAGTTCGTGGAGCTGTCCTATTCCGTGCGGGAGGGGCCCTGCTGGCGCAAAAGGG 
GTTATAAGACCCTTCTCAAGTGCCTCTCAGGTAAATTCTGCCGCCGGGAGCTGATTGGCATCATGGGCCCCT 
CAGGGGCTGGCAAGTCTACATTCATGAACATCTTGGCAGGATACAGGGAGTCTGGAATGAAGGGGCAGATCC 
TGGTTAATGGAAGGCCACGGGAGCTGAGGACCTTCCGCAAGATGTCCTGCTACATCATGCAAGATGACATGC
TGCTGCCGCACCTCACGGTGTTGGAAGCCATGATGGTCTCTGCTAACCTGAAGCTGAGTGAGAAGCAGGAGG 
TGAAGAAGGAGCTGGTGACAGAGATCCTGACGGCACTGGGCCTGATGTCGTGCTCCCGCACGAGGACAGCCC 
TGCTCTCTGGCGGGCAGAGGAAGCGTCTGGCCATCGCCCTGGAGCTGGTCAACAACCCGCCTGTCATGTTCT 
TTGATGAGCCCACCAGTGGTCTGGATAGCGCCTCTTGTTTCCAAGTGGTGTCCCTCATGAAGTCCCTGGCAC 
AGGGGGGCCGTACCATCATCTGCACCATCCACCAGCCCAGTGCCAAGCTCTTTGAGATGTTTGACAAGCTCT 
ACATCCTGAGCCAGGGTCAGTGCATCTTCAAAGGCGTGGTCACCAACCTGATCCCCTATCTAAAGGGACTCG 
GCTTGCATTGCCCCACCTACCACAACCCGGCTGACTTCATCATCGAGGTGGCCTCTGGCGAGTATGGAGACC 
TGAACCCCATGTTGTTCAGGGCTGTGCAGAATGGGCTGTGCGCTATGGCTGAGAAGAAGAGCAGCCCTGAGA 
AGAACGAGGTCCCTGCCCCATGCCCTCCTTGTCCTCCGGAAGTGGATCCCATTGAAAGCCACACCTTTGCCA 
CCAGCACCCTCACACAGTTCTGCATCCTCTTCAAGAGGACCTTCCTGTCCATCCTCAGGGACACGGTCCTGA 
CCCACCTACGGTTCATGTCCCACGTGGTTATTGGCGTGCTCATCGGCCTCCTCTACCTGCATATTGGCGACG 
ATGCCAGCAAGGTCTTCAACAACACCGGCTGCCTCTTCTTCTCCATGCTGTTCCTCATGTTCGCCGCCCTCA 
TGCCAACTGTGCTCACCTTCCCCTTAGAGATGGCGGTCTTCATGAGGGAGCACCTCAACTACTGGTACAGCC 
TCAAAGCGTATTACCTGGCCAAGACCATGGCTGACGTGCCCTTTCAGGTGGTGTGTCCGGTGGTCTACTGCA 
GCATTGTGTACTGGATGACGGGCCAGCCCGCTGAGACCAGCCGCTTCCTGCTCTTCTCAGCCCTGGCCACCG 
CCACCGCCTTGGTGGCCCAATCTTTGGGGCTGCTGATCGGAGCTGCTTCCAACTCCCTACAGGTGGCCACTT
TTGTGGGCCCAGTTACCGCCATCCCTGTCCTCTTGTTCTCCGGCTTCTTTGTCAGCTTCAAGACCATCCCCA 
CTTACCTGCAATGGAGCTCCTATCTCTCCTATGTCAGGTATGGCTTTGAGGGTGTGRTCCTGACGATCTATG 
GCATGGAGCGAGGAGACCTGACATGTTTAGAGGAACGCTGCCMGTTCCGGGAGCCACAGAGCATCCTCCGAG 
CGCTGGATGTGGAGGATGCCAAGCTCTACATGGACTTCCTGGTCTTGGGCATCTTCTTCCTAGCCCTGCGGC 
TGCTGGCCTACCTTGTGCTGCGTTACCGGGTCAAGTCAGAGAGATAGAGGCTTGCCCCAGCCTGTACCCCAG
CCCCTGCAGCAGGAAGCCCCCAGTCCCAGCCCTTTGGGACTGTTTTAACCTTATAGACTTGGGCACTGGTTC 
CTGGCGGGGCTATCCTCTCCTCCCTTGGCTCCTCCACAGGCTGGCTGTCGGACTGCGCTCCCAGCCTGGGCT 
CTGGGAGTGGGGGCTCCAGCCCTCCCCACTATGCCCAGGAGTCTTCCCAAGTTGATGCGGTTTGTAGCTTCC 
TCCCTACTCTCTCCAACACCTGCATGCAAAGACTACTGGGAGGCTGCTGCCTCCTTCCTGCCCATGGCACCC 
TCCTCTGCTGTCTGCCTGGGAGCCCTAGGCTCTCTAGGGCCCCACTTACAACTGACCAAAGTGGCCCCCTCT 
KGGGGTCCCCACCACACAAGTGTTTGTAAACTGGGCTGCTATAAGGTTGGAGTTCCAGGGCTGGGCCCTGGT 
GGAGTCCACTGGAAGTCCCATCATGGATGTTGAAATGGACAGGGAAGGACTCTGGAAGTCTCTTCCTCCTCC 
TCCTCTTCTCTCCACCCCTAGACCCTGGCTGACTTGGACAATCTGCCAGGACAGAAGCTGGGGTTTTCTGTC 
TAGGTCACCACTCCCAATCCTGGGGGRTTGGAGRGGCCTGGGGSTGTGGGRTGSCCCATCCCCCTCCCCATC 
ACCTTTGGTGGGGGSAGGGCCTG 
G4-clone polypeptide sequence range 1 to 646

MAEKALEAVGCGLGPGAVAMAVTLEDGAEPPVLTTHLKKVENHITEAQRFSHLPKRSAVD 
IEFVELSYSVREGPCWRKRGYKTLLKCLSGKFCRRELIGIMGPSGAGKSTFMNILAGYRE 
SGMKGQILVNGRPRELRTFRKMSCYIMQDDMLLPHLTVLEAMMVSANLKLSEKQEVKKEL 
VTEILTALGLMSCSRTRTALLSGGQRKRLAIALELVNNPPVMFFDEPTSGLDSASCFQVV 
SLMKSLAQGGRTIICTIHQPSAKLFEMFDKLYILSQGQCIFKGVVTNLIPYLKGLGLHCP 
TYHNPADFIIEVASGEYGDLNPMLFRAVQNGLCAMAEKKSSPEKNEVPAPCPPCPPEVDP 
IESHTFATSTLTQFCILFKRTFLSILRDTVLTHLRFMSHVVIGVLIGLLYLHIGDDASKV 
FNNTGCLFFSMLFLMFAALMPTVLTFPLEMAVFMREHLNYWYSLKAYYLAKTMADVPFQV 
VCPVVYCSIVYWMTGQPAETSRFLLFSALATATALVAQSLGLLIGAASNSLQVATFVGPV
TAIPVLLFSGFFVSFKTIPTYLQWSSYLSYVRYGFEGVXLTIYGMERGDLTCLEERCXFR
EPQSILRALDVEDAKLYMDFLVLGIFFLALRLLAYLVLRYRVKSER



Figure 11



